A Study of Likelihood Ratio Calibration in High Vocal Effort Speech for a Modern Automatic Speaker Recognition System

Authors

  • Miranti Indar Mandasari
  • Rahim Saeidi
  • David A. van Leeuwen
Abstract

The production of speech is influenced not only by intrinsic factors such as semantics, dialect, human perspective and emotion, but also by extrinsic factors such as environmental conditions and the transmission channel. In certain acoustic conditions, a speaker tends to raise vocal effort in order to overcome environmental hindrances such as the presence of noise or a long distance between speaker and listener. Only a few studies have addressed speaker recognition under non-neutral speech production conditions (i.e., high or low vocal effort and speech under stress) (Hansen, 2011). In real forensic cases, however, the incriminating recording may have been made with high vocal effort, which then has to be dealt with in speaker comparison.

Similar resources

Investigating the COG ratio as feature for speaker verification on high-effort speech

Vocal effort mismatch between training and test data leads to severe degradation of speaker recognition systems. The changes in the acoustics of a speech signal induced by raised vocal effort are complex and, despite several studies by various authors, not yet completely understood. Instead of just gaining knowledge about these differences for automatic speaker recognition it is rather essential to ...

Fast Estimation of Warping Factors in Vocal Tract Length Normalization Using Scores Obtained from Gender Detection Modeling

The performance of automatic speech recognition (ASR) systems is adversely affected by variations in speakers, audio channels and environmental conditions. Making these systems robust to such variations is still a big challenge. One of the main sources of variation among speakers is the difference in their Vocal Tract Length (VTL). Vocal Tract Length Normalization (VTLN) is an effe...

Effective Segmentation based on Vocal Effort Change Point Detection

Non-neutral speech data has a strong negative impact on speech processing systems such as Automatic Speech Recognition (ASR) or speaker ID systems [1]. It is therefore necessary to detect and segment non-neutral speech data before further processing steps. Alternatively, the detection and segmentation of non-neutral speech segments from an input speech stream can be used in speech analysis and ...

Speaker Line-up Calibration of the i-vector Based Speaker Recognition System for Forensic Application

An automatic speaker recognition (ASR) system must produce reliable likelihood ratios (LR) in order to be used for evaluating and presenting speech evidence in court. The LR is only reliable if it is produced by a well-calibrated ASR system. A study by Rodriguez (2007) showed that the LR calculated from the uncalibrated system was often misleading, while the calibrated system produced more reliable LR...

A Database for Automatic Persian Speech Emotion Recognition: Collection, Processing and Evaluation

Recent developments in robotics automation have motivated researchers to improve the efficiency of interactive systems by enabling natural man-machine interaction. Since speech is the most popular method of communication, recognizing human emotions from the speech signal has become a challenging research topic known as Speech Emotion Recognition (SER). In this study, we propose a Persian em...

Publication date: 2012